 time complexity


Fast Bellman Updates for Wasserstein Distributionally Robust MDPs

Neural Information Processing Systems

Markov decision processes (MDPs) are often sensitive to model ambiguity. In recent years, robust MDPs have emerged as an effective framework to overcome this challenge. Distributionally robust MDPs extend the robust MDP framework by incorporating distributional information about the uncertain model parameters, which alleviates the conservatism of robust MDPs.





Reward Imputation with Sketching for Contextual Batched Bandits

Neural Information Processing Systems

Contextual batched bandit (CBB) is a setting where a batch of rewards is observed from the environment at the end of each episode, but the rewards of the non-executed actions are unobserved, resulting in partial-information feedback.


Supplementary Materials A Complexity Analysis

Neural Information Processing Systems

Our proposed method significantly reduces communication overhead in federated learning. This method poses a trade-off between time and memory complexity. We also provide detailed information about the optimization hyperparameters, e.g. … In this section, we explore the effect of fitness sparsification, i.e. selecting the top-k fitness values from the … To enable a fair and insightful comparison between the two population sizes, we assess performance by the number of members remaining after sparsification rather than by directly contrasting sparsification rates. Our results underline the crucial role that population size plays in exploring optimal solutions, overshadowing even the significance of the compression rate.